# Diffusion Transformer
Normal Lora
Other
ICEdit is an instruction-based image editing method built on large-scale diffusion transformers. It reaches SOTA results with only 0.5% of the training data and 1% of the parameters used by prior methods.
Image Generation, Supports Multiple Languages
RiverZ
26.58k
40
TextFlux
TextFlux is a high-fidelity multilingual scene text synthesis model based on an OCR-free diffusion transformer. It uses FLUX.1-Fill-dev as the base model and focuses on the scene text synthesis task.
Image Generation
yyyyyxie
284
2
MegaTTS 3
Apache-2.0
MegaTTS 3 is a zero-shot speech synthesis model based on a sparse-alignment-enhanced latent diffusion transformer. It supports both English and Chinese speech synthesis.
Speech Synthesis, Supports Multiple Languages
RedbeardNZ
26
0
DiT Wikiart Large
MIT
A diffusion transformer model trained on the WikiArt dataset to generate artwork images.
Image Generation
kaupane
35
0
DiT Wikiart Small
MIT
A diffusion transformer model trained on the WikiArt dataset to generate images in artistic styles.
Image Generation
kaupane
29
0
InfiniteYou
InfiniteYou (InfU) is an identity-preserving image generation framework based on the FLUX Diffusion Transformer (DiT). It can flexibly recraft a photo while preserving the subject's identity.
Image Generation, English
ByteDance
12.57k
591
Hunyuan3D-2
Other
A 3D synthesis system developed by Tencent that generates high-resolution textured 3D assets from images or text.
3D Vision, Supports Multiple Languages
tencent
490.00k
1,314
TransPixar
Apache-2.0
TransPixar is a text-to-video generation model that produces RGBA videos, that is, videos with a transparency (alpha) channel.
Video Processing
wileewang
95
37
RDT-170M
MIT
RDT-170M is a 170-million-parameter imitation learning diffusion Transformer model designed for robot vision-language-action tasks.
Multimodal Fusion, Transformers, English
robotics-diffusion-transformer
278
7
OminiControl
OminiControl is a general-purpose control model for diffusion transformers, focused on image-to-image tasks.
Image Generation
Yuanshi
6,390
139
RDT-1B
MIT
RDT-1B is a 1-billion-parameter imitation learning diffusion transformer pretrained on 1M+ multi-robot episodes. It supports multi-view vision-language-action prediction.
Multimodal Fusion, Transformers, English
robotics-diffusion-transformer
2,644
80
PixArt LCM XL 2 1024 MS
PixArt-LCM is a text-to-image generation model based on a diffusion transformer that combines the strengths of PixArt-α and LCM. It generates high-quality images from text prompts in only a few sampling steps.
Image Generation
PixArt-alpha
625
60